
    On the Effectiveness of Unit Tests in Test-driven Development

    Background: Writing unit tests is one of the primary activities in test-driven development. Yet, existing reviews report little evidence supporting or refuting the effect of this development approach on test case quality. Developers' lack of ability and skill in producing sufficiently good test cases is also reported as a limitation of applying test-driven development in industrial practice. Objective: We investigate the impact of test-driven development on the effectiveness of unit test cases compared to incremental test-last development in an industrial context. Method: We conducted an experiment in an industrial setting with 24 professionals. Professionals followed the two development approaches to implement the tasks. We measure unit test effectiveness in terms of mutation score. We also measure branch and method coverage of test suites to compare our results with the literature. Results: In terms of mutation score, we found that the test cases written for the test-driven development task have a higher defect detection ability than those written for the incremental test-last development task. Subjects wrote test cases that cover more branches on the test-driven development task than on the other task. However, test cases written for the incremental test-last development task cover more methods than those written for the test-driven development task. Conclusion: Our findings differ from previous studies conducted in academic settings. Professionals were able to perform more effective unit testing with test-driven development. Furthermore, we observe that the coverage measures preferred in academic studies reveal different aspects of a development approach. Our results need to be validated in larger industrial contexts. This work was supported by Istanbul Technical University Scientific Research Projects (MGA-2017-40712) and the Academy of Finland (Decision No. 278354).
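    The mutation score used in this study can be illustrated with a minimal sketch (the function, mutants, and tests below are hypothetical; real studies use mutation tools that generate mutants automatically): the score is the fraction of seeded faults that the test suite detects.

```python
# Illustrative sketch (not the study's tooling): mutation score is the
# fraction of seeded faults ("mutants") that a test suite detects ("kills").

def original(a, b):
    return a + b

# Hand-written mutants of `original`; real tools generate these automatically.
mutants = [
    lambda a, b: a - b,   # operator replaced
    lambda a, b: a * b,   # operator replaced
    lambda a, b: a + b,   # equivalent mutant: no test can kill it
]

# A toy test suite: each test returns True if the implementation passes it.
tests = [
    lambda f: f(2, 3) == 5,
    lambda f: f(0, 0) == 0,
]

def mutation_score(mutants, tests):
    # A mutant is "killed" if at least one test fails against it.
    killed = sum(1 for m in mutants if not all(t(m) for t in tests))
    return killed / len(mutants)

print(mutation_score(mutants, tests))  # 2 of 3 mutants killed -> ~0.67
```

    A higher score indicates a test suite with better defect detection ability, which is why the study prefers it over coverage alone.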

    OECD Recommendation's draft concerning access to research data from public funding: A review

    Sharing research data from public funding is an important topic, especially now, during global emergencies like the COVID-19 pandemic, when we need policies that enable rapid sharing of research data. Our aim is to discuss and review the revised Draft of the OECD Recommendation Concerning Access to Research Data from Public Funding. The Recommendation is based on ethical scientific practice, but in order to apply it in real settings, we suggest several enhancements to make it more actionable. In particular, the constant maintenance of provided software stipulated by the Recommendation is virtually impossible even for commercial software. Other major concerns are insufficient clarity regarding how to finance data repositories in joint private-public investments, inconsistencies between data security and user-friendliness of access, little focus on the reproducibility of submitted data, risks related to the mining of large data sets, and the protection of sensitive (particularly personal) data. In addition, we identify several risks and threats that need to be considered when designing and developing data platforms to implement the Recommendation (e.g., not only the descriptions of the data formats but also the data collection methods should be available). Furthermore, the uneven level of readiness of some countries for the practical implementation of the proposed Recommendation poses a risk of its delayed or incomplete implementation.

    Effects of Test-Driven Development: A Comparative Analysis of Empirical Studies

    Test-driven development is a software development practice where small sections of test code are used to direct the development of program units. Writing test code prior to the production code promises several positive effects on the development process itself as well as on the associated products and processes. However, there are few comparative studies on the effects of test-driven development. Thus, it is difficult to assess the potential process and product effects when applying test-driven development. In order to get an overview of the observed effects of test-driven development, an in-depth review of existing empirical studies was carried out. The results for ten different internal and external quality attributes indicate that test-driven development can reduce the amount of introduced defects and lead to more maintainable code. Parts of the implemented code may also be somewhat smaller in size and complexity. While maintenance of test-driven code can take less time, initial development may last longer. Besides the comparative analysis, this article sketches related work and gives an outlook on future research.

    Peer reviewed

    The Importance of the Correlation in Crossover Experiments

    Context: In empirical software engineering, crossover designs are popular for experiments comparing software engineering techniques that must be undertaken by human participants. However, their value depends on the correlation (r) between the outcome measures on the same participants. Software engineering theory emphasizes the importance of individual skill differences, so we would expect the values of r to be relatively high. However, few researchers have reported the values of r. Goal: To investigate the values of r found in software engineering experiments. Method: We undertook simulation studies to investigate the theoretical and empirical properties of r. Then we investigated the values of r observed in 35 software engineering crossover experiments. Results: The level of r obtained by analysing our 35 crossover experiments was small. Estimates based on means, medians, and random-effect analysis disagreed but all fell between 0.2 and 0.3. As expected, our analyses found large variability among the individual r estimates for small sample sizes, but no indication that r estimates were larger for the experiments with larger sample sizes that exhibited smaller variability. Conclusions: Low observed r values cast doubt on the validity of crossover designs for software engineering experiments. However, if the cause of low r values relates to training limitations or toy tasks, this affects all Software Engineering (SE) experiments involving human participants. For all human-intensive SE experiments, we recommend more intensive training and then tracking the improvement of participants as they practice using specific techniques, before formally testing the effectiveness of the techniques.
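    The role of r can be sketched with a small simulation (all parameters below are hypothetical, not taken from the paper): in an AB/BA crossover, each participant produces one outcome per technique, both outcomes share that participant's individual skill, and r is the Pearson correlation between the paired outcomes.

```python
# Sketch with invented data: when a shared per-participant skill component
# dominates the noise, the paired outcomes correlate and r is high.
import random

random.seed(1)
n = 30
skill = [random.gauss(50, 10) for _ in range(n)]      # per-participant ability
out_a = [s + random.gauss(0, 8) for s in skill]       # outcome with technique A
out_b = [s + 5 + random.gauss(0, 8) for s in skill]   # technique B adds a shift

def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

print(round(pearson_r(out_a, out_b), 2))
```

    Shrinking the skill spread (or inflating the noise terms) drives r toward the low 0.2-0.3 range the paper reports, which weakens the efficiency advantage of the crossover design.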

    Empirical Investigation on Agile Methods Usage: Issues Identified from Early Adopters in Malaysia

    Agile methods are a set of software practices that can help to produce products faster and, at the same time, deliver what customers want. Despite the benefits that Agile methods can deliver, we found few studies from the Southeast Asia region, particularly Malaysia. As a result, little empirical evidence is available in the country, making implementation harder. When adopting a new method, the experience of other practitioners is critical, as it describes what is important, what is possible, and what is not possible concerning Agile. We conducted a qualitative study to understand the issues faced by early adopters in Malaysia, where Agile methods are still relatively new. The initial study involves 13 participants, including project managers, CEOs, founders, and software developers from seven organisations. Our study has shown that social and human aspects are important when using Agile methods. While technical aspects have always been considered essential in software development, we found these factors to be less important when using Agile methods. The results obtained can serve as guidelines for practitioners in the country and the neighbouring regions.

    Software defect prediction: do different classifiers find the same defects?

    Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

    During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.

    Peer reviewed. Final published version.
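    The paper's key observation can be sketched with hypothetical predictions (module names, flagged sets, and classifier labels below are invented for illustration): two classifiers can report identical recall while detecting largely different subsets of the actual defects.

```python
# Illustrative sketch with invented data: same recall, different defects found.
actual_defects = {"m1", "m2", "m3", "m4", "m5", "m6"}

# Modules each (hypothetical) classifier flagged as defective.
predicted = {
    "RandomForest": {"m1", "m2", "m3", "m4", "m7"},
    "NaiveBayes":   {"m3", "m4", "m5", "m6", "m8"},
}

# True positives per classifier: flagged modules that really are defective.
detected = {name: flagged & actual_defects for name, flagged in predicted.items()}

for name, hits in detected.items():
    recall = len(hits) / len(actual_defects)
    print(name, sorted(hits), f"recall={recall:.2f}")  # both reach recall=0.67

# Defects that only one of the two classifiers finds: an ensemble that keeps
# minority votes could recover these, unlike majority voting.
unique = detected["RandomForest"] ^ detected["NaiveBayes"]
print(sorted(unique))  # ['m1', 'm2', 'm5', 'm6']
```

    This is why the abstract argues for ensembles whose decision strategy is not majority voting: defects found by only one classifier would be discarded by a majority rule.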